METHODS AND APPARATUS FOR LOSSLESS DATA HIDING
CROSS REFERENCE TO RELATED APPLICATIONS 

[0001]. This application claims the benefit of U.S. Provisional Patent Application No.:
60/ , , entitled Lossless Image Data Hiding, by inventors Guorong Xuan and Yun-Qing
Shi, filed December 17, 2002, the entire disclosure of which is hereby
incorporated by reference.

BACKGROUND 

[0002]. This application is directed to methods and apparatus for data hiding in an
image and, more particularly, to lossless data hiding using the integer wavelet transform. 

[0003]. In the field of data hiding, pieces of information represented by the data are
hidden in the cover media (e.g., a pixel image). In some applications, people care about 
whether the embedded data are perceptible with the cover media. That is, the hidden data 
and the cover media may be closely related. For this type of data embedding, it may be 
desirable to invert the marked media back to the original cover media after the hidden data 
have been retrieved. For example, perceptual transparency and inversion of marked media 
may be desirable for applications such as medical diagnosis and law enforcement. The 
marking techniques satisfying these requirements are referred to as lossless, distortion-free, 
and reversible or invertible data hiding techniques.

[0004]. Although most of the current digital watermarking algorithms are not lossless, 
some recent marking techniques have been reported as being lossless. For example, two 
methods carried out in the image spatial domain purport to be lossless. The details of these 
methods may be found in U.S. Patent No. 6,278,791 (the entire disclosure of which is 
hereby incorporated by reference) and J. Fridrich, M. Goljan and R. Du, "Invertible 
Authentication," Proc. SPIE, Security and Watermarking of Multimedia Contents, pp. 197- 
208, San Jose, CA, (Jan. 2001). A purportedly lossless marking technique has also been 
developed in the transform domain, as is discussed in detail in B. Macq and F. Deweyand, 
"Trusted Headers For Medical Images," DFG VIII-D II Watermarking Workshop, 
Erlangen, Germany, (Oct. 1999). As these techniques are directed toward data
authentication, instead of data embedding, the amount of hidden data that may be achieved 
is quite limited. Another lossless marking technique that may be suitable for some higher 
quantities of data embedding has also been developed and is discussed in detail in U.S. 
Patent Application No.: 2003/0081809 (the entire disclosure of which is hereby 
incorporated by reference). The amount of hidden data achievable by this technique, 
however, is still not large enough for many applications, such as medical applications. 
Indeed, the pay-load ranges from 3,000 bits to 24,000 bits for a 512 x 512 x 8 grayscale
image. 

[0005]. Accordingly, there are needs in the art for new methods and apparatus for 
achieving lossless marking that can embed a relatively large amount of data. 

SUMMARY OF THE INVENTION 

[0006]. In accordance with one or more aspects of the present invention, a data hiding
encoding method includes: subjecting an original, pixel domain image to an Integer 
Wavelet Transform (IWT) to obtain a matrix of wavelet coefficients; selecting at least one 
bit plane between a least significant bit plane and a most significant bit plane of the matrix 
of wavelet coefficients; compressing the at least one selected bit plane to produce free 
space in the at least one selected bit plane; embedding hidden data in the free space of the 
at least one compressed bit plane; and subjecting the at least one embedded bit plane and 
the other bit planes to an Inverse IWT to produce a marked pixel domain image. 

[0007]. In accordance with one or more further aspects of the present invention, a
hidden data decoding method includes: subjecting a marked pixel domain image to an
Integer Wavelet Transform (IWT) to obtain a matrix of wavelet coefficients; selecting at 
least one bit plane between a least significant bit plane and a most significant bit plane of 
the matrix of wavelet coefficients that contains hidden data; extracting the hidden data 
from the at least one bit plane; decompressing the at least one bit plane; and subjecting all 
bit planes to an Inverse IWT to produce an original pixel domain image. 

[0008]. In accordance with one or more further aspects of the present invention, the
methods and apparatus for data hiding described thus far and/or described
later in this document, may be achieved utilizing suitable hardware, such as that shown in 
the drawings hereinbelow. Such hardware may be implemented utilizing any of the known 
technologies, such as standard digital circuitry, analog circuitry, any of the known 
processors that are operable to execute software and/or firmware programs, one or more 
programmable digital devices or systems, such as programmable read only memories 
(PROMs), programmable array logic devices (PALs), any combination of the above, etc. 
Further, the methods of the present invention may be embodied in a software program that 
may be stored on any of the known or hereinafter developed media. 

[0009]. Other aspects, features and advantages of the present invention will become 
apparent to those skilled in the art when the description herein is taken in conjunction with 
the accompanying drawing. 

BRIEF DESCRIPTION OF THE DRAWING 

[0010]. For the purposes of illustration, there are forms shown in the drawings that are 
presently preferred, it being understood, however, that the invention is not limited to the 
precise arrangements and instrumentalities shown. 

[0011]. FIG. 1 is a block diagram illustrating the concept of bit planes as used in pixel 
(spatial) domain images and/or frequency domain images in accordance with one or more 
aspects of the present invention; 

[0012]. FIG. 2 is a block diagram of an encoding system for embedding hidden data in 
a pixel domain image in accordance with one or more aspects of the present invention; 

[0013]. FIG. 3 is a block diagram of a decoding system for extracting embedded hidden 
data from the pixel domain image in accordance with one or more aspects of the present 
invention; 

[0014]. FIG. 4 is an illustration of a pixel domain image representing data that were
hidden in a number of test images in accordance with one or more aspects of the present 
invention; 

[0015]. FIG. 5 is an illustration of a pixel domain image in which no hidden data have 
been embedded; 

[0016]. FIG. 6 is an illustration of the pixel domain image of FIG. 5 in which hidden
data have been embedded in accordance with one or more aspects of the present invention; 

[0017]. FIG. 7 is an illustration of test results indicating peak signal to noise ratios 
(PSNR) and corresponding hidden data payload sizes for several test images in which 
hidden data have been embedded in accordance with one or more aspects of the present 
invention; 

[0018]. FIG. 8 is an illustration of comparisons between hidden data payload sizes for 
several embedding techniques, including that in accordance with one or more aspects of 
the present invention; 

[0019]. FIG. 9 is an illustration of comparisons between an unmodified histogram and a 
modified histogram of an image in accordance with one or more aspects of the present
invention; 

[0020]. FIG. 10 is an illustration of a specific example of modifying a histogram to 
achieve results similar to that of FIG. 9 in accordance with one or more aspects of the
present invention; 

[0021]. FIG. 11 is an illustration of the specific histogram data before, during, and after
modification in accordance with one or more aspects of the present invention; and 

[0022]. FIG. 12 is an illustration of how information concerning the modification to the 
histogram may be recorded for later use in a post-processing step in accordance with one 
or more aspects of the present invention. 

DETAILED DESCRIPTION OF THE INVENTION

[0023]. In general, the present invention is directed to methods and apparatus for hiding
(embedding) a relatively large amount of data in an image, where the original image may 
be recovered without any (or any substantial) distortion from the marked image after the 
hidden data have been extracted. The methods and apparatus hide the data and overhead 
data, representing bookkeeping information, into high frequency sub-bands of one or more 
middle bit-planes of integer wavelet coefficients of the original pixel domain (spatial 
domain) image. 

[0024]. Prior to discussing further details concerning the various aspects of the present 
invention, reference will now be made to FIG. 1, which is a block diagram illustrating the 
concept of bit planes as used in pixel domain images and/or frequency domain images. As 
shown in the illustrated example, an image may be represented by N x M pixels. For 
simplicity of discussion, each pixel may be represented by 8 bits to quantify the grayscale 
of the image. It is noted that any number of bits may be used to represent each pixel. In 
this example, therefore, the original pixel image may be arranged into eight separate bit 
planes: a least significant bit-plane (the 1st bit-plane), a next least significant bit-plane (the
2nd bit-plane), etc., and a most significant bit-plane (the 8th bit-plane). The 1st bit-plane
contains the least significant bit of each 8-bit word representing each pixel. The 2nd
bit-plane contains the next least significant bit of each 8-bit word representing each pixel, and
so on. 
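
The bit-plane decomposition described above may be sketched in a few lines of Python. This is only an illustrative sketch, not part of the disclosed apparatus: the function names `bit_planes` and `merge_planes` are hypothetical, and the image is assumed to be a small list of rows of non-negative integers.

```python
def bit_planes(pixels, bits=8):
    """Split a 2-D list of non-negative integers into `bits` binary planes.

    planes[0] is the least significant (1st) bit-plane and
    planes[bits - 1] is the most significant one.
    """
    return [[[(value >> k) & 1 for value in row] for row in pixels]
            for k in range(bits)]


def merge_planes(planes):
    """Rebuild the pixel values from the planes produced by bit_planes."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[k][r][c] << k for k in range(len(planes)))
             for c in range(cols)] for r in range(rows)]


image = [[0, 255, 130], [7, 64, 201]]   # a tiny 2 x 3, 8-bit example
planes = bit_planes(image)
assert merge_planes(planes) == image    # the decomposition loses nothing
```

Because the planes can always be merged back into the original values, any change made to a single plane stays confined to one bit position of each word.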

[0025]. It has been found through experimentation and study of commonly used 
grayscale images that binary 0s and 1s are almost randomly, and with nearly equal likelihood, distributed in
the first several "lower" bit-planes. The bias between 0s and 1s starts to gradually increase
in the several "higher" bit-planes, although this bias is not very large. This kind of bias
indicates the existence of redundancy in the bits, and in accordance with various aspects of
the invention, implies that one may compress bits in a bit-plane or more than one bit-plane
so as to leave free space to hide data. To achieve a larger bias as between 0s and 1s, which
may be exploited to achieve larger free space, image transforms may be employed. For
example, an Integer Wavelet Transform (IWT) may be used to transform an image from
the pixel domain to the frequency domain. In particular, the CDF(2, 2) and similar series
used in the JPEG2000 standard may be employed. The forward transform and the inverse 
transform of the CDF(2, 2) Integer Wavelet Transform are as follows: 

a. Forward Transform

1. Splitting: $s_i \leftarrow x_{2i}$, $d_i \leftarrow x_{2i+1}$

2. Dual lifting: $d_i \leftarrow d_i - \{(s_i + s_{i+1})/2\}$

3. Primal lifting: $s_i \leftarrow s_i + \{(d_{i-1} + d_i)/4\}$

b. Inverse Transform

1. Inverse primal lifting: $s_i \leftarrow s_i - \{(d_{i-1} + d_i)/4\}$

2. Inverse dual lifting: $d_i \leftarrow d_i + \{(s_i + s_{i+1})/2\}$

3. Merging: $x_{2i} \leftarrow s_i$, $x_{2i+1} \leftarrow d_i$
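
The lifting steps listed above may be sketched for a one-dimensional signal as follows. Several points are assumptions of this sketch rather than statements of the text: the braces are taken to denote rounding to an integer (floor division is used here), indices that fall outside the signal are clamped to the nearest valid index, and the names `iwt_forward` and `iwt_inverse` are hypothetical. A two-dimensional image would typically be handled by applying such a routine along rows and then columns.

```python
def iwt_forward(x):
    """One level of CDF(2,2) integer lifting on an even-length list of ints.

    Assumptions: floor division stands in for the rounding written with
    braces, and out-of-range indices are clamped to the signal boundary.
    """
    half = len(x) // 2
    s = [x[2 * i] for i in range(half)]            # splitting: even samples
    d = [x[2 * i + 1] for i in range(half)]        # splitting: odd samples
    for i in range(half):                          # dual lifting
        d[i] -= (s[i] + s[min(i + 1, half - 1)]) // 2
    for i in range(half):                          # primal lifting
        s[i] += (d[max(i - 1, 0)] + d[i]) // 4
    return s, d


def iwt_inverse(s, d):
    """Undo iwt_forward exactly by running the steps in reverse order."""
    s, d = list(s), list(d)
    half = len(s)
    for i in range(half):                          # inverse primal lifting
        s[i] -= (d[max(i - 1, 0)] + d[i]) // 4
    for i in range(half):                          # inverse dual lifting
        d[i] += (s[i] + s[min(i + 1, half - 1)]) // 2
    x = [0] * (2 * half)
    x[0::2] = s                                    # merging: even positions
    x[1::2] = d                                    # merging: odd positions
    return x


signal = [10, 12, 14, 200, 13, 11, 9, 8]
low, high = iwt_forward(signal)
assert iwt_inverse(low, high) == signal            # integer-exact round trip
```

Each inverse step subtracts or adds back exactly the integer that the matching forward step added or subtracted, so the round trip is exact whichever integer rounding rule is chosen, provided that both directions use the same rule.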

[0026]. Further details of the IWT may be found, for example, in A. R. Calderbank, I.
Daubechies, W. Sweldens and B. Yeo, "Wavelet Transforms That Map Integers To 
Integers," Applied and Computational Harmonic Analysis, Vol. 5, No. 3, pp.332-369 
(1998). The IWT is a desirable transform because it can reveal more redundancy to embed 
more data while avoiding round-off error. 

[0027]. Experimental study of commonly used images has revealed that a larger bias 
may be achieved between binary 0s and 1s starting from the 2nd bit-plane of the IWT
coefficients of a pixel domain image, as compared to the bias of the pixel domain image 
itself. Further, the higher the bit-plane, the larger the apparent bias. Bit plane allocation 
relates to the allocation of the data to be embedded into sub-bands and corresponding bit- 
planes of the wavelet transform. The LSB (least significant bit) replacement method in the 
wavelet domain appears to perform better than that in the spatial domain because the 
wavelet is closer to the human visual system (HVS). The HVS model points out different
sensitivities among sub-bands of different levels. The lower the level to which a sub-band belongs, the
less sensitive the HVS is to it. Within the same level, the HVS is least sensitive to the HH sub-band,
more sensitive to the HL and LH sub-bands, and most sensitive to the LL sub-band. Lower
sensitivity of the HVS means that more data can be embedded without causing notable visual
artifacts. On the other hand, a change made in a high bit-plane will lead to a larger distortion,
whereas the lowest bit-planes exhibit little bias and therefore little free space. Thus, in
order to have the marked image perceptually the same as the original image, it is preferred
that the hidden data be embedded in a "middle" bit-plane in the IWT domain.

[0028]. It is also desirable to achieve high PSNR (peak signal to noise ratio) in the 
marked image. To achieve this feature, it is preferred that the data are embedded in the 
high frequency sub-bands in accordance with various aspects of the present invention. It is 
most preferred that the embedded data are hidden in the LH1, HL1 and HH1 sub-bands of
one or multiple middle bit-planes. 

[0029]. Reference is now made to FIG. 2, which is a block diagram of an encoding
system 100 for embedding hidden data in a pixel domain image in accordance with one or 
more aspects of the present invention. The system 100 includes a pre-processing unit 102, 
an IWT unit 104, a compression unit 106, an embedding unit 108, and an IIWT unit 110.

[0030]. The pre-processing unit 102 is preferably operable to modify a histogram of the 
original image. Indeed, when certain aspects of the present invention are employed to 
embed hidden data in an image, it is possible for the marked image to have one or more 
pixels represented by overflow/underflow values, such as values representing the grayscale 
of the image. In particular, an overflow/underflow grayscale value may exceed an upper 
bound and/or a lower bound defined by the number of bits representing each pixel. For 
example, when the pixels of an image are represented by 8-bit words, an overflow
grayscale value may exceed 255 (the upper bound of an 8-bit word). Similarly, an underflow
grayscale value may fall below a lower bound of zero. It is believed that the possibility for 
overflow/underflow is caused by changes taking place in the selected bit plane, such as 
exists when the high frequency IWT coefficients are modified to include the embedded
data. 

[0031]. In this regard, the pre-processing unit 102 is preferably operable to remove 
values at the extremes of the histogram, such as at zero (or near zero) and at 255 (or near 
255). Indeed, it is understood that a histogram may be represented by points plotted in a 
Cartesian coordinate system in which discrete intensity levels exist along an ordinate axis 
and numbers of pixels having such intensity levels exist along an abscissa axis. The
pre-processing unit 102 preferably operates to modify any data plotted near extremes of the
ordinate axis toward more moderate locations. This advantageously mitigates against any 
overflow/underflow that may occur when modifications are made to the selected bit plane, 
particularly at the high frequency IWT coefficients. 

[0032]. The pre-processed original image (i.e., the image having a modified histogram)
is passed on to the IWT unit 104. It is noted that information concerning exactly how the 
histogram was modified by the pre-processing unit 102 is also forwarded to the embedding 
unit 108 such that the marked image will contain the information. In this way, a post- 
process may reverse the histogram modification in order to recover the hidden data and 
recover the original image. The IWT unit 104 is preferably operable to subject the original 
image (as modified by the pre-processing unit 102) to the well-known Integer Wavelet
Transform in order to produce a matrix of wavelet coefficients.

[0033]. One or more of the bit-planes of the wavelet coefficients are passed to the 
compression unit 106 and the remaining bit-planes of the matrix of wavelet coefficients are 
passed to the embedding unit 108. Indeed, at least one bit plane between a least significant 
bit plane and a most significant bit plane of the wavelet coefficients is selected to receive 
the hidden data. As discussed above, in order to have the marked image perceptually the 
same as the original image, it is preferred that the bit-plane selected to receive the hidden 
data is a "middle" bit-plane in the IWT domain. The compression unit 106 is preferably 
operable to subject the wavelet coefficients to an entropy coding algorithm. Although any 
of the known entropy coding algorithms may be employed, such as the lossless arithmetic
coding algorithm, it is most preferred that the well-known JBIG lossless coding technique
is employed because of its superior compression ratio. Further details concerning the 
arithmetic coding algorithm may be found in Y. Q. Shi and H. Sun, "Image and Video 
Compression for Multimedia Engineering," Boca Raton, FL: CRC, (1999). 
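
As a rough guide to how much free space entropy coding of a biased bit-plane might yield, the following sketch computes an idealized estimate from the binary entropy of the plane. It assumes a memoryless source, which practical coders such as JBIG or arithmetic coding neither require nor exactly attain, so the figure is an indication only. The function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy, in bits per symbol, of a binary source with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def estimated_free_bits(plane_bits):
    """Idealized free space left after entropy coding a flat list of 0/1 bits.

    Assumes independent, identically distributed bits; a real coder will
    not hit this figure exactly and must also store some side information.
    """
    n = len(plane_bits)
    p_one = sum(plane_bits) / n
    return n - math.ceil(n * binary_entropy(p_one))


balanced = [0, 1] * 50            # no bias: nothing to gain
biased = [1] * 10 + [0] * 90      # 10% ones: roughly half the plane is freed
print(estimated_free_bits(balanced), estimated_free_bits(biased))
```

The estimate makes the role of the bias concrete: a plane with equally likely 0s and 1s yields no free space, while the free space grows as the proportion of 1s moves away from one half.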

[0034]. A compressed bit stream may be produced by the compression unit 106, for
example, utilizing a zig-zag scanning pattern. 

[0035]. The compressed bit stream, the information concerning the histogram
pre-processing, the mark data, and an optional secret key are preferably provided to the
embedding unit 108. The embedding unit 108 is preferably operable to insert the hidden data
and the information concerning pre-processing into the free space of the one or more
selected bit planes. As discussed above, it is preferred that the hidden data are embedded
in the high frequency sub-bands of the one or more selected bit planes. It is most preferred
that the high frequency sub-bands are at least one of the LH1, the HL1, and the HH1
sub-bands. The secret key may be utilized to define the organization and/or coding of how the
hidden data and/or pre-processing data are embedded into the one or more selected bit 
planes. Thus, the hidden data may remain secret even if the algorithm for decoding a 
marked image is known. 

[0036]. Thereafter, the marked image in the frequency domain is transformed into the 
spatial domain utilizing the Inverse Integer Wavelet Transform carried out by the IIWT
unit 110. 

[0037]. Reference is now made to FIG. 3, which is a block diagram of a decoding
system 200 for extracting embedded hidden data and an original spatial image in 
accordance with one or more aspects of the present invention. The decoding system 200 
includes an IWT unit 202, an extracting unit 204, a decompression unit 206, an IIWT unit 
208, and a post-processing unit 210. 

[0038]. The IWT unit 202 is preferably operable to subject the marked image to the 
Integer Wavelet Transform in order to convert the marked image from the spatial domain 
into the frequency domain and to produce a matrix of IWT coefficients. The extracting 
unit 204 is operable to remove the hidden data and the information concerning pre- 
processing (i.e., the modifications made to the histogram) from the matrix of IWT 
coefficients. It is noted that if a secret key was utilized to embed the data in the encoding 
system 100 (FIG. 2) then the secret key may be necessary in order to extract such data at
the decoder 200. 

[0039]. The selected bit-plane or planes that were compressed in the encoding system
100 are preferably subjected to decompression by the decompression unit 206 utilizing an appropriate inverse entropy
coding algorithm. The output of the decompression unit 206, together with the matrix of
coefficients from the IWT unit 202, produces a matrix of IWT coefficients that are input
into the IIWT unit 208. The IIWT unit 208 converts the IWT coefficients from the frequency
domain into the spatial domain, i.e., into a matrix of pixel values. The information 
concerning histogram modification carried out by the pre-processing unit 102 of the 
encoding system 100 is preferably utilized by the post-processing unit 210, which is 
basically an inverse algorithm to recover the original histogram information of the original 
image. 
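
The compress, embed, extract and decompress cycle applied to a selected bit-plane may be illustrated with the following sketch. It is a simplified stand-in and not the disclosed implementation: the bit-plane is treated as a flat list of bits rather than as sub-bands of IWT coefficients, Python's general-purpose `zlib` compressor replaces JBIG or arithmetic coding, no secret key or histogram bookkeeping is included, and the stream layout with two 32-bit length fields is hypothetical.

```python
import zlib


def to_bits(data):
    """Bytes -> list of bits, most significant bit of each byte first."""
    return [(byte >> (7 - k)) & 1 for byte in data for k in range(8)]


def to_bytes(bits):
    """List of bits (length a multiple of 8) -> bytes; inverse of to_bits."""
    return bytes(sum(bit << (7 - k) for k, bit in enumerate(bits[i:i + 8]))
                 for i in range(0, len(bits), 8))


def int_to_bits(value, width=32):
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]


def bits_to_int(bits):
    return int("".join(map(str, bits)), 2)


def embed(plane, payload):
    """Return a marked plane of the same length as `plane` (a 0/1 list).

    Hypothetical layout: 32-bit byte count of the compressed plane, the
    compressed plane, 32-bit payload length in bits, the payload, padding.
    """
    packed = zlib.compress(to_bytes(plane), 9)   # zlib stands in for JBIG
    stream = (int_to_bits(len(packed)) + to_bits(packed)
              + int_to_bits(len(payload)) + list(payload))
    if len(stream) > len(plane):
        raise ValueError("not enough free space in the selected bit-plane")
    return stream + [0] * (len(plane) - len(stream))


def extract(marked):
    """Return (original plane, payload) from a plane produced by embed."""
    n_bytes = bits_to_int(marked[:32])
    end = 32 + 8 * n_bytes
    plane = to_bits(zlib.decompress(to_bytes(marked[32:end])))
    n_payload = bits_to_int(marked[end:end + 32])
    return plane, marked[end + 32:end + 32 + n_payload]


plane = ([0] * 15 + [1]) * 256            # a strongly biased 4096-bit plane
payload = [1, 0, 1, 1, 0, 0, 1, 0] * 4    # 32 hidden bits
marked = embed(plane, payload)
assert extract(marked) == (plane, payload)
```

The marked plane has the same length as the original, so it can take the original's place among the other bit-planes before the inverse transform. If the plane is not compressible enough to leave room for the payload, the sketch raises an error instead of truncating data.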

[0040]. It is noted that the discussion herein concerning the hiding of data in bit planes 
representing the grayscale of an image may be readily applied by one skilled in the art to 
hiding data in bit planes representing the color information of an image. Indeed, the bit 
planes representing red, blue, and/or green (or any other color representation scheme) may 
be embedded with hidden data in accordance with the various aspects of the present 
invention. 

[0041]. The encoding and decoding techniques in accordance with various aspects of 
the present invention have been experimentally applied to a number of different images 
with successful results. For example, various encoding algorithms of the invention have 
been applied to grayscale images and medical images. FIG. 4 illustrates hidden data that
were embedded into a number of images. The hidden data constitute a binary logo image, 
equivalent to a binary sequence of 23,040 bits. For comparison purposes, the well-known 
image of "Lena" illustrated in FIG. 5 has not been embedded with any hidden data. 

[0042]. This original Lena pixel image is 512x512x8 bits. Using the various features 
of the present invention, the data of FIG. 4 as well as book-keeping data (pre-processing 
histogram modification data), and the losslessly compressed data of one or more of the bit- 
planes of the Lena image were combined to produce the marked Lena image of FIG. 6. In 
particular, the original 5th bit-plane of the IWT coefficients of the original Lena image was
selected to receive the mark data and the pre-processing information. The 5th bit-plane of
the IWT coefficients was subjected to arithmetic entropy coding to produce free space in
that bit plane and all the data were embedded into the high frequency sub-bands of this bit
plane. More particularly, the hidden data were embedded into the high frequency
sub-bands LH1, HL1 and HH1 of the 5th bit-plane.

[0043]. Pre-processing was also carried out on the original Lena image using the 
following histogram modification algorithm: the lowest and the highest 16 grayscale 
values were mapped to grayscale values 15 and 240, respectively. In this way, the 
overflow/underflow is avoided. In order to recover the original image losslessly, the data 
representing the necessary book-keeping information were also hidden as overhead. 

[0044]. A secret key was also used to define the form of the embedded data. The secret 
key function used was $y = (k_0 + k_1 \times x) \bmod s$, in which $k_0 = 1030$, $k_1 = 289$,
$s = 3 \times 256 \times 256$, and x, y are the coordinates in the 5th bit-plane.
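
The key function above may be sketched as follows. The interpretation is an assumption of this sketch: x is taken to be the index of a bit in the stream to be embedded and y the position it occupies among the s = 3 x 256 x 256 positions, a count that would match three 256 x 256 high frequency sub-bands of a 512 x 512 image. The names `scramble` and `unscramble` are hypothetical, and the modular inverse via `pow` requires Python 3.8 or later.

```python
K0, K1 = 1030, 289
S = 3 * 256 * 256          # 196,608 positions


def scramble(x, k0=K0, k1=K1, s=S):
    """Map stream index x to position y = (k0 + k1 * x) mod s."""
    return (k0 + k1 * x) % s


def unscramble(y, k0=K0, k1=K1, s=S):
    """Invert scramble; needs gcd(k1, s) == 1 so that k1 has an inverse mod s."""
    return (pow(k1, -1, s) * (y - k0)) % s


assert unscramble(scramble(12345)) == 12345
```

Since 289 = 17 x 17 shares no factor with s = 3 x 2^16, the mapping visits every position exactly once, so a decoder holding the same key values can undo it without loss.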

[0045]. It is noted that there are no perceptible artifacts in the marked Lena image of 
FIG. 6, although as illustrated in FIG. 7, the PSNR of the marked image is not
particularly high. FIG. 7 also shows the PSNR and payload values demonstrated on a
number of other images. Even though the PSNR of the marked pepper image is only 29.11
dB, no annoying structural interference could be observed. The
experimental results demonstrated that the low PSNR was attributable to the histogram 
modification in the pre-processing stage. 
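
For reference, the PSNR figures discussed above follow the standard definition, 10 log10(peak^2 / MSE), which may be sketched as below. The function name `psnr` and the list-of-rows image representation are assumptions of this sketch; the peak value of 255 corresponds to 8-bit grayscale.

```python
import math


def psnr(original, marked, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D lists."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(original, marked)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)          # mean squared error per pixel
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

Identical images give an infinite PSNR, and larger pixel differences lower the value, which is why a pre-processing step that shifts many grayscale values can reduce the PSNR even when no artifacts are visible.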

[0046]. FIG. 8 illustrates a comparison between existing lossless marking techniques
and the techniques of the present invention in terms of pay-load.

[0047]. Although the specific histogram modification technique described above may 
be used, any of the known histogram modification or grayscale mapping techniques may 
be used to prevent overflow/underflow in accordance with various aspects of the present 
invention. Further details regarding a preferred histogram modification technique will now 
be provided. In order to prevent overflow/underflow after the inverse wavelet transform, 
the pre-processing histogram modification preferably narrows the histogram from both 
sides as shown in FIG. 9. In narrowing down a histogram to the range [G/2, 255-G/2], the
histogram modification information should be recorded as part of the embedded data.
Thus, the embedded data include three parts: the watermark signal, bookkeeping 
information of the histogram modification, and losslessly compressed data from the 
original bit-planes. 

[0048]. In order to illustrate the histogram modification, an illustrative, simplified 
example will now be described in which the size of an original image is 6 x 6 with 8 = 2^3
grayscales (6 x 6 x 3) as shown in FIG. 10. From FIG. 10 and FIG. 11, it can be seen that
the range of the modified histogram is from 1-6 instead of 0-7. In other words, no pixel 
assumes a grayscale of 0 and/or 7. After modification, the grayscale of 1 is merged into the 
grayscale of 2. The grayscale of 0 becomes the grayscale of 1. In the same way, the 
grayscale of 6 is merged into the grayscale of 5. And the grayscale of 7 becomes the 
grayscale of 6. The details of the differences in the histogram of the image before and after 
modification are shown in FIGS. 10 and 11. 

[0049]. With reference to FIG. 12, the bookkeeping information concerning what
changes were made to the histogram is illustrated. The scan sequence from the left hand side
(101101) in FIG. 12 shows that both the second and the fifth "2" in FIG. 10 were originally "1"
in FIG. 10, as found by scanning as follows: (<x = 5, y = 1>, <x = 1, y = 4>). The scan sequence
from the right hand side (11011) in FIG. 12 shows that the third "5", found by scanning
(<x = 4, y = 2>) in FIG. 10, was originally "6" in FIG. 10.
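
The simplified 3-bit example above, including its bookkeeping bits, may be sketched as follows. The function names `narrow` and `restore` are hypothetical, the image is assumed to be given as a flat list of pixels in scan order, and the pixel list in the usage lines is made up for illustration (it is not the data of FIG. 10), chosen only so that it reproduces the two bit sequences 101101 and 11011.

```python
def narrow(pixels, top=7):
    """Narrow a histogram on [0, top] to [1, top - 1]; assumes top >= 5.

    Returns the modified pixels and two bookkeeping bit lists. A 0 bit
    marks a pixel that was merged into its neighbouring level; a 1 bit
    marks a pixel that already had that level.
    """
    out, left, right = [], [], []
    for p in pixels:
        if p == 0:
            out.append(1)                      # 0 becomes 1
        elif p in (1, 2):
            out.append(2)                      # 1 is merged into 2
            left.append(1 if p == 2 else 0)
        elif p == top:
            out.append(top - 1)                # 7 becomes 6
        elif p in (top - 1, top - 2):
            out.append(top - 2)                # 6 is merged into 5
            right.append(1 if p == top - 2 else 0)
        else:
            out.append(p)
    return out, left, right


def restore(pixels, left, right, top=7):
    """Recover the original pixels from narrow's output and bookkeeping."""
    left, right = iter(left), iter(right)
    out = []
    for p in pixels:
        if p == 1:
            out.append(0)
        elif p == 2:
            out.append(2 if next(left) else 1)
        elif p == top - 1:
            out.append(top)
        elif p == top - 2:
            out.append(top - 2 if next(right) else top - 1)
        else:
            out.append(p)
    return out


original = [2, 1, 2, 2, 1, 2, 0, 7, 5, 5, 6, 5, 5, 3, 4]   # made-up scan
modified, left_bits, right_bits = narrow(original)
assert left_bits == [1, 0, 1, 1, 0, 1] and right_bits == [1, 1, 0, 1, 1]
assert restore(modified, left_bits, right_bits) == original
```

Only the merged levels need bookkeeping bits. After narrowing, every pixel at level 1 must have come from level 0 and every pixel at level 6 from level 7, so those moves are reversible without any side information.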

[0050]. Using the alternative pre-processing histogram modification technique 
described above and by using more than one bit plane in which to hide data, a significant 
improvement in PSNR and payload may be achieved as compared to the results of FIG. 7 
above. Indeed, as to the Lena image: with a PSNR of 35 dB, a payload of 0.5 bpp (bits 
per pixel), i.e., 131,072 bits, can be hidden inside a 512x512 grayscale image. With a
PSNR of 44 dB, a payload of 0.15 bpp, i.e., 39,321 bits, can be hidden inside the 512x512 
grayscale image. 

[0051]. Advantageously, various aspects of the present invention permit the hiding
(embedding) of a relatively large amount of data in an image, where the original image 
may be recovered without substantial (or any) distortion from the marked image after the
hidden data have been extracted. 

[0052]. Although the invention herein has been described with reference to particular 
embodiments, it is to be understood that these embodiments are merely illustrative of the 
principles and applications of the present invention. It is therefore to be understood that 
numerous modifications may be made to the illustrative embodiments and that other 
arrangements may be devised without departing from the spirit and scope of the present 
invention as defined by the appended claims. 